Goto

Collaborating Authors

 Guizhou Province



Thousands of Companies Are Driving China's AI Boom. A Government Registry Tracks Them All

WIRED

Thousands of Companies Are Driving China's AI Boom. How the Cyberspace Administration of China inadvertently made a guide to the country's homegrown AI revolution. When DeepSeek burst onto the global stage in January 2025, it seemed to appear out of nowhere. But the large language model was just one of the thousands of generative AI tools that have been released in China since 2023--and there's a public archive of every single one of them. Here are 23 ways China is rewiring the future .


FreDN: Spectral Disentanglement for Time Series Forecasting via Learnable Frequency Decomposition

An, Zhongde, You, Jinhong, Li, Jiyanglin, Tang, Yiming, Li, Wen, Du, Heming, Du, Shouguo

arXiv.org Machine Learning

Time series forecasting is essential in a wide range of real world applications. Recently, frequency-domain methods have attracted increasing interest for their ability to capture global dependencies. However, when applied to non-stationary time series, these methods encounter the $\textit{spectral entanglement}$ and the computational burden of complex-valued learning. The $\textit{spectral entanglement}$ refers to the overlap of trends, periodicities, and noise across the spectrum due to $\textit{spectral leakage}$ and the presence of non-stationarity. However, existing decompositions are not suited to resolving spectral entanglement. To address this, we propose the Frequency Decomposition Network (FreDN), which introduces a learnable Frequency Disentangler module to separate trend and periodic components directly in the frequency domain. Furthermore, we propose a theoretically supported ReIm Block to reduce the complexity of complex-valued operations while maintaining performance. We also re-examine the frequency-domain loss function and provide new theoretical insights into its effectiveness. Extensive experiments on seven long-term forecasting benchmarks demonstrate that FreDN outperforms state-of-the-art methods by up to 10\%. Furthermore, compared with standard complex-valued architectures, our real-imaginary shared-parameter design reduces the parameter count and computational cost by at least 50\%.


GRPO-GCC: Enhancing Cooperation in Spatial Public Goods Games via Group Relative Policy Optimization with Global Cooperation Constraint

Yang, Zhaoqilin, Li, Chanchan, Liu, Tianqi, Zhao, Hongxin, Tian, Youliang

arXiv.org Artificial Intelligence

Inspired by the principle of self-regulating cooperation in collective institutions, we propose the Group Relative Policy Optimization with Global Cooperation Constraint (GRPO-GCC) framework. This work is the first to introduce GRPO into spatial public goods games, establishing a new deep reinforcement learning baseline for structured populations. GRPO-GCC integrates group relative policy optimization with a global cooperation constraint that strengthens incentives at intermediate cooperation levels while weakening them at extremes. This mechanism aligns local decision making with sustainable collective outcomes and prevents collapse into either universal defection or unconditional cooperation. The framework advances beyond existing approaches by combining group-normalized advantage estimation, a reference-anchored KL penalty, and a global incentive term that dynamically adjusts cooperative payoffs. As a result, it achieves accelerated cooperation onset, stabilized policy adaptation, and long-term sustainability. GRPO-GCC demonstrates how a simple yet global signal can reshape incentives toward resilient cooperation, and provides a new paradigm for multi-agent reinforcement learning in socio-technical systems.



RMAU-NET: A Residual-Multihead-Attention U-Net Architecture for Landslide Segmentation and Detection from Remote Sensing Images

Pham, Lam, Le, Cam, Tang, Hieu, Truong, Khang, Nguyen, Truong, Lampert, Jasmin, Schindler, Alexander, Boyer, Martin, Phan, Son

arXiv.org Artificial Intelligence

In recent years, landslide disasters have reported frequently due to the extreme weather events of droughts, floods , storms, or the consequence of human activities such as deforestation, excessive exploitation of natural resources. However, automatically observing landslide is challenging due to the extremely large observing area and the rugged topography such as mountain or highland. This motivates us to propose an end-to-end deep-learning-based model which explores the remote sensing images for automatically observing landslide events. By considering remote sensing images as the input data, we can obtain free resource, observe large and rough terrains by time. To explore the remote sensing images, we proposed a novel neural network architecture which is for two tasks of landslide detection and landslide segmentation. We evaluated our proposed model on three different benchmark datasets of LandSlide4Sense, Bijie, and Nepal. By conducting extensive experiments, we achieve F1 scores of 98.23, 93.83 for the landslide detection task on LandSlide4Sense, Bijie datasets; mIoU scores of 63.74, 76.88 on the segmentation tasks regarding LandSlide4Sense, Nepal datasets. These experimental results prove potential to integrate our proposed model into real-life landslide observation systems.


DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal

Liu, Wenjie, Wang, Bingshu, Wang, Ze, Chen, C. L. Philip

arXiv.org Artificial Intelligence

Document shadow removal is a crucial task in the field of document image enhancement. However, existing methods tend to remove shadows with constant color background and ignore color shadows. In this paper, we first design a diffusion model in latent space for document image shadow removal, called DocShaDiffusion. It translates shadow images from pixel space to latent space, enabling the model to more easily capture essential features. To address the issue of color shadows, we design a shadow soft-mask generation module (SSGM). It is able to produce accurate shadow mask and add noise into shadow regions specially. Guided by the shadow mask, a shadow mask-aware guided diffusion module (SMGDM) is proposed to remove shadows from document images by supervising the diffusion and denoising process. We also propose a shadow-robust perceptual feature loss to preserve details and structures in document images. Moreover, we develop a large-scale synthetic document color shadow removal dataset (SDCSRD). It simulates the distribution of realistic color shadows and provides powerful supports for the training of models. Experiments on three public datasets validate the proposed method's superiority over state-of-the-art. Our code and dataset will be publicly available.


DeepSelective: Interpretable Prognosis Prediction via Feature Selection and Compression in EHR Data

Zhang, Ruochi, Yang, Qian, Wang, Xiaoyang, Wang, Tian, Zhou, Qiong, Deng, Ziqi, Li, Kewei, Wang, Yueying, Fan, Yusi, Zhang, Jiale, Huang, Lan, Liu, Chang, Zhou, Fengfeng

arXiv.org Artificial Intelligence

The rapid accumulation of Electronic Health Records (EHRs) has transformed healthcare by providing valuable data that enhance clinical predictions and diagnoses. While conventional machine learning models have proven effective, they often lack robust representation learning and depend heavily on expert-crafted features. Although deep learning offers powerful solutions, it is often criticized for its lack of interpretability. To address these challenges, we propose DeepSelective, a novel end to end deep learning framework for predicting patient prognosis using EHR data, with a strong emphasis on enhancing model interpretability. DeepSelective combines data compression techniques with an innovative feature selection approach, integrating custom-designed modules that work together to improve both accuracy and interpretability. Our experiments demonstrate that DeepSelective not only enhances predictive accuracy but also significantly improves interpretability, making it a valuable tool for clinical decision-making. The source code is freely available at http://www.healthinformaticslab.org/supp/resources.php .


VidEvent: A Large Dataset for Understanding Dynamic Evolution of Events in Videos

Liang, Baoyu, Su, Qile, Zhu, Shoutai, Liang, Yuchen, Tong, Chao

arXiv.org Artificial Intelligence

Despite the significant impact of visual events on human cognition, understanding events in videos remains a challenging task for AI due to their complex structures, semantic hierarchies, and dynamic evolution. To address this, we propose the task of video event understanding that extracts event scripts and makes predictions with these scripts from videos. To support this task, we introduce VidEvent, a large-scale dataset containing over 23,000 well-labeled events, featuring detailed event structures, broad hierarchies, and logical relations extracted from movie recap videos. The dataset was created through a meticulous annotation process, ensuring high-quality and reliable event data. We also provide comprehensive baseline models offering detailed descriptions of their architecture and performance metrics. These models serve as benchmarks for future research, facilitating comparisons and improvements. Our analysis of VidEvent and the baseline models highlights the dataset's potential to advance video event understanding and encourages the exploration of innovative algorithms and models. The dataset and related resources are publicly available at www.videvent.top.


Spatially Directional Dual-Attention GAT for Spatial Fluoride Health Risk Modeling

Yuan, Da

arXiv.org Artificial Intelligence

Environmental exposure to fluoride is a major public health concern, particularly in regions with naturally elevated fluoride concentrations. Accurate modeling of fluoride-related health risks, such as dental fluorosis, requires spatially aware learning frameworks capable of capturing both geographic and semantic heterogeneity. In this work, we propose Spatially Directional Dual-Attention Graph Attention Network (SDD-GAT), a novel spatial graph neural network designed for fine-grained health risk prediction. SDD-GAT introduces a dual-graph architecture that disentangles geographic proximity and attribute similarity, and incorporates a directional attention mechanism that explicitly encodes spatial orientation and distance into the message passing process. To further enhance spatial coherence, we introduce a spatial smoothness regularization term that enforces consistency in predictions across neighboring locations. We evaluate SDD-GAT on a large-scale dataset covering over 50,000 fluoride monitoring samples and fluorosis records across Guizhou Province, China. Results show that SDD-GAT significantly outperforms traditional models and state-of-the-art GNNs in both regression and classification tasks, while also exhibiting improved spatial autocorrelation as measured by Moran's I. Our framework provides a generalizable foundation for spatial health risk modeling and geospatial learning under complex environmental settings.